Path Integral Networks: End-to-End Differentiable Optimal Control

Authors

  • Masashi Okada
  • Luca Rigazio
  • Takenobu Aoshima
Abstract

In this paper, we introduce Path Integral Networks (PI-Net), a recurrent network representation of the Path Integral optimal control algorithm. The network includes both system dynamics and cost models, used for optimal control based planning. PI-Net is fully differentiable, learning both dynamics and cost models end-to-end by back-propagation and stochastic gradient descent. Because of this, PI-Net can learn to plan. PI-Net has several advantages: it can generalize to unseen states thanks to planning, it can be applied to continuous control tasks, and it allows for a wide variety of learning schemes, including imitation and reinforcement learning. Preliminary experimental results show that PI-Net, trained by imitation learning, can mimic control demonstrations for two simulated problems: a linear system and a pendulum swing-up problem. We also show that PI-Net is able to learn dynamics and cost models latent in the demonstrations.
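To make the planning loop mentioned in the abstract more concrete, the following is a minimal sketch of a generic sampling-based path integral control update. It is not the authors' implementation: the 1-D linear dynamics, the quadratic cost, and all names and parameters (`dynamics`, `cost`, `sigma`, `lam`, `n_samples`, `n_iters`) are assumptions chosen for illustration. In PI-Net the dynamics and cost would be learned networks and the fixed number of iterations would be unrolled as recurrent layers so that gradients can flow through the whole plan; this plain-Python version only shows the forward computation.

```python
import math
import random

def dynamics(x, u):
    # Hypothetical 1-D linear system (coefficients are assumptions).
    return 0.9 * x + 0.5 * u

def cost(x, u):
    # Hypothetical quadratic running cost.
    return x * x + 0.01 * u * u

def rollout_cost(x0, controls):
    # Accumulate the cost of applying a control sequence from x0.
    x, total = x0, 0.0
    for u in controls:
        total += cost(x, u)
        x = dynamics(x, u)
    return total + x * x  # quadratic terminal cost

def path_integral_update(x0, controls, rng, n_samples=64, sigma=0.5, lam=1.0):
    # Sample perturbed control sequences, weight each by exp(-cost / lam),
    # and shift the nominal controls by the weighted average perturbation.
    noises, costs = [], []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, sigma) for _ in controls]
        noises.append(eps)
        costs.append(rollout_cost(x0, [u + e for u, e in zip(controls, eps)]))
    c_min = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    z = sum(weights)
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, noises)) / z
        for t, u in enumerate(controls)
    ]

def plan(x0, horizon=10, n_iters=20, seed=0):
    # Repeat the update a fixed number of times; this fixed-length loop is
    # what a recurrent-network formulation would unroll.
    rng = random.Random(seed)
    controls = [0.0] * horizon
    for _ in range(n_iters):
        controls = path_integral_update(x0, controls, rng)
    return controls
```

Calling `plan(2.0)` returns a list of `horizon` control values. Because every step is built from sums, products, and exponentials, the same computation could be expressed in an automatic differentiation framework, which is the property the abstract relies on for end-to-end training.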

Similar resources

Social Milieu Oriented Routing: A New Dimension to Enhance Network Security in WSNs

In large-scale wireless sensor networks (WSNs), in order to enhance network security, it is crucial for a trustor node to perform social milieu oriented routing to a target trustee node to carry out trust evaluation. This challenging social milieu oriented routing with more than one end-to-end Quality of Trust (QoT) constraint has proved to be NP-complete. Heuristic algorithms with polynomial...

Path integral control and state-dependent feedback.

In this paper we address the problem of computing state-dependent feedback controls for path integral control problems. To this end we generalize the path integral control formula and utilize this to construct parametrized state-dependent feedback controllers. In addition, we show a relation between control and importance sampling: Better control, in terms of control cost, yields more efficient...

Acceleration of Gradient-based Path Integral Method for Efficient Optimal and Inverse Optimal Control

This paper deals with a new accelerated path integral method, which iteratively searches optimal controls with a small number of iterations. This study is based on the recent observations that a path integral method for reinforcement learning can be interpreted as gradient descent. This observation also applies to an iterative path integral method for optimal control, which sets a convincing ar...

Path selection in user-controlled circuit-switched optical networks

User-controlled circuit-switched optical networks are gaining popularity in an effort to fulfill the insatiable data transport needs of the online community. In this paper we consider the resource allocation challenges that arise in such networks, in particular problems related to construction of end-to-end lightpaths for carrying large multimedia streams. Specifically, we discuss variations of...

Using a Fuzzy Rule-based Algorithm to Improve Routing in MPLS Networks

Today, wireless and intelligent networks are widely used in many fields such as information technology and networking. There are several types of these networks, of which MPLS networks are one. However, MPLS networks raise issues in design and implementation, for example security, throughput, losses, power consumption and so on. Basically,...

Journal title:
  • CoRR

Volume abs/1706.09597

Publication date 2017